Role of diversification risk in financial bubbles 

Wanfeng Yan†, Ryan Woodard† and Didier Sornette†‡

† Chair of Entrepreneurial Risks, Department of Management, Technology and Economics,
ETH Zurich, CH-8001 Zurich, Switzerland.

‡ Swiss Finance Institute, c/o University of Geneva,
40 blvd. Du Pont d'Arve, CH-1211 Geneva 4, Switzerland.






We present an extension of the Johansen-Ledoit-Sornette (JLS) model to include
an additional pricing factor called the "Zipf factor", which describes the diversification
risk of the stock market portfolio. Keeping all the dynamical characteristics of
a bubble described in the JLS model, the new model provides additional information
about the concentration of stock gains over time. This allows us to understand better
the risk diversification and to explain the investors' behavior during the bubble
generation. We apply this new model to two famous Chinese stock bubbles, from
August 2006 to October 2007 (bubble 1) and from October 2008 to August 2009
(bubble 2). The Zipf factor is found highly significant for bubble 1, corresponding
to the fact that valuation gains were more concentrated on the large firms of the
Shanghai index. It is likely that the widespread acknowledgement of the 80-20 rule
in the Chinese media and discussion forums led many investors to discount the risk
of a lack of diversification, therefore enhancing the role of the Zipf factor. For bubble
2, the Zipf factor is found marginally relevant, suggesting a larger weight of market
gains on small firms. We interpret this result as the consequence of the response of
the Chinese economy to the very large stimulus provided by the Chinese government
in the aftermath of the 2008 financial crisis.

I. INTRODUCTION



Johansen et al. [1–3] developed a model (referred to below as the JLS model) of financial
bubbles and crashes, which is an extension of the rational expectation bubble model of
Blanchard and Watson [4]. In this model, a crash is seen as an event potentially terminating
the run-up of a bubble. A financial bubble is modeled as a regime of accelerating
(super-exponential power law) growth punctuated by short-lived corrections organized according
to the symmetry of discrete scale invariance [5]. The super-exponential power law is argued
to result from positive feedback resulting from noise trader decisions that tend to enhance
deviations from fundamental valuation in an accelerating spiral.

The JLS model has proved to be a powerful and flexible tool for detecting financial
bubbles and crashes in various kinds of markets, such as the 2006-2008 oil bubble [6], the
Chinese index bubble in 2009 [7], the real estate market in Las Vegas [8], the South African
stock market bubble [9] and the US repurchase agreement market [10]. Recently, the JLS
model has been extended to detect market rebounds [11] and to infer the fundamental market
value hidden within observed prices [12]. Also, new experiments in ex-ante bubble detection
and forecast have been performed in the Financial Crisis Observatory at ETH Zurich [13, 14].

Here, we present an extension of the JLS model, which is in the spirit of the approach
developed by Zhou and Sornette [15] to include additional pricing factors.

The literature on factor models is huge and we refer e.g. to Ref. [16] and references
therein for a review. One of the most famous factor models, now considered
a standard benchmark, is the three-factor Fama-French model [17–20] augmented by
the momentum factor [21]. Recently, the concept of the Zipf factor has been introduced
[22, 23]. The key idea of the Zipf factor is that, due to the concentration of the market
portfolio when the distribution of the capitalization of firms is sufficiently heavy-tailed, as is
the case empirically, a risk factor generically appears in addition to the simple market factor,
even for very large economies. Malevergne et al. [22, 23] proposed a simple proxy for the
Zipf factor as the difference in returns between the equal-weighted and the value-weighted
market portfolios. They have shown that the resulting two-factor
model (market portfolio + the new factor termed "Zipf factor") is as successful empirically
as the three-factor Fama-French model. Specifically, tests of the Zipf model with size and
book-to-market double-sorted portfolios as well as industry portfolios find that the Zipf
model performs practically as well as the Fama-French model in terms of the magnitude
and significance of pricing errors and explanatory power, despite having only two factors
instead of three.

In the present paper, we introduce a new model by combining the Zipf factor
with the JLS model. The new model keeps all the dynamical characteristics of a bubble
described in the JLS model. In addition, the new model provides information
about the concentration of stock gains over time from the knowledge of the Zipf factor. This
new information helps to understand the risk diversification and to explain the
investors' behavior during the bubble generation.

The paper is organized as follows. Section II describes the definition of the Zipf factor
as well as the new model. The derivation of the model is presented in this section and the
appendix. Section III introduces the calibration method of this new model. Then we test
the new model with two famous Chinese stock bubbles in Section IV and
discuss the role of the Zipf factor in these two bubbles. Section V concludes.

II. THE MODEL 

We introduce the new model in this section. Our goal is to combine the Zipf factor $z(t)dt$
with the JLS model of the bubble dynamics. To be specific, we introduce the following
definition.

Definition 1: The Zipf factor $z(t)dt$ is defined as proportional to the difference between
the returns of the capitalization-weighted portfolio and the equal-weighted portfolio for the
last time step:

$$ z(t)dt := \frac{dp}{p} - \frac{dp_e}{p_e} , \tag{1} $$

where $p$ (respectively $p_e$) is the price of the capitalization-weighted (respectively
equal-weighted) portfolio, $dp := p(t) - p(t - dt)$ and $dp_e := p_e(t) - p_e(t - dt)$. The weights of
the portfolios are normalized so that their two prices are identical at the day preceding the
beginning time $t_0$ of the time series: $p_e(t_0) = p(t_0)$.



Definition 2: The integrated Zipf factor $\zeta(t)$ is obtained by taking the integral of the Zipf
factor defined by expression (1):

$$ \zeta(t) := \ln p(t) - \ln p_e(t) . \tag{2} $$

By definition, the Zipf factor describes the exposure to a lack of diversification due to the
concentration of the stock market on a few very large firms.
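Definitions 1 and 2 can be illustrated on a short price series. The following is a minimal sketch, not the authors' code: the function names and the two price lists are hypothetical, and each increment is divided by the previous price (a simple-return convention), which is an assumption since the definition does not say at which time the denominator is evaluated.

```python
import math

def zipf_factor(p, pe):
    """Discrete Zipf factor z(t)dt = dp/p - dpe/pe for each step after the first.

    Assumption: each increment is divided by the previous price (simple return).
    """
    return [
        (p[i] - p[i - 1]) / p[i - 1] - (pe[i] - pe[i - 1]) / pe[i - 1]
        for i in range(1, len(p))
    ]

def integrated_zipf_factor(p, pe):
    """Integrated Zipf factor zeta(t) = ln p(t) - ln pe(t), as in Eq. (2)."""
    return [math.log(a) - math.log(b) for a, b in zip(p, pe)]

# Hypothetical prices, normalized so that pe(t0) = p(t0).
p = [100.0, 110.0, 121.0]    # capitalization-weighted portfolio
pe = [100.0, 105.0, 110.25]  # equal-weighted portfolio

z = zipf_factor(p, pe)
zeta = integrated_zipf_factor(p, pe)
```

In this toy series the capitalization-weighted portfolio gains 10% per step against 5% for the equal-weighted one, so the Zipf factor is positive at every step and the integrated Zipf factor starts at zero and grows, which is the situation reported below for bubble 1.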

The dynamics of stock markets during a bubble regime is then described as

$$ \frac{dp}{p(t)} = \mu(t)dt + \gamma z(t)dt + \sigma(t)dW - \kappa dj , \tag{3} $$

where $p$ is the portfolio price, $\mu$ is the drift (or trend) whose accelerated growth describes the
presence of a bubble (see below), $\gamma$ is the factor loading on the Zipf factor and $dW$ is the
increment of a Wiener process (with zero mean and unit variance). The term $dj$ represents a
discontinuous jump such that $dj = 0$ before the crash and $dj = 1$ after the crash occurs. The
loss amplitude associated with the occurrence of a crash is determined by the parameter $\kappa$.
The assumption of a constant jump size is easily relaxed by considering a distribution of
jump sizes, with the condition that its first moment exists. Then, the no-arbitrage condition
is expressed similarly with $\kappa$ replaced by its mean. Each successive crash corresponds to a
jump of $dj$ by one unit. The dynamics of the jumps is governed by a crash hazard rate $h(t)$.
Since $h(t)dt$ is the probability that the crash occurs between $t$ and $t + dt$ conditional on the
fact that it has not yet happened, we have $E_t[dj] = 1 \times h(t)dt + 0 \times (1 - h(t)dt)$, where $E_t[\cdot]$
denotes the expectation operator. This leads to

$$ E_t[dj] = h(t)dt . \tag{4} $$

Noise traders exhibit collective herding behaviors that may destabilize the market in this
model. We assume that the aggregate effect of noise traders can be accounted for by the
following dynamics of the crash hazard rate

$$ h(t) = B'(t_c - t)^{m-1} + C'(t_c - t)^{m-1} \cos(\omega \ln(t_c - t) - \phi') . \tag{5} $$

The intuition behind this specification (5) has been presented at length by Johansen et
al. [1–3], and further developed by Sornette and Johansen [24], Ide and Sornette [25] and
Zhou and Sornette [26]. In a nutshell, the power law behavior $\sim (t_c - t)^{m-1}$ embodies the
mechanism of positive feedback posited to be at the source of the bubbles. If the exponent
$m < 1$, the crash hazard may diverge as $t$ approaches a critical time $t_c$, corresponding
to the end of the bubble. The cosine term in the r.h.s. of (5) takes into account the
existence of a possible hierarchical cascade of panic acceleration punctuating the course of
the bubble, resulting either from a preexisting hierarchy in noise trader sizes and/or
from the interplay between market price impact inertia and nonlinear fundamental value
investing [25].
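The behavior of the hazard rate (5) as $t$ approaches $t_c$ can be checked numerically. The sketch below is illustrative only: the function name and every parameter value are hypothetical, chosen so that $|C'| \leq B'$ and the hazard rate stays non-negative.

```python
import math

def crash_hazard_rate(t, tc, m, omega, b_prime, c_prime, phi_prime):
    """Eq. (5): h(t) = (tc - t)^(m-1) * [B' + C' cos(omega ln(tc - t) - phi')]."""
    tau = tc - t
    if tau <= 0:
        raise ValueError("h(t) is only defined before the critical time tc")
    oscillation = math.cos(omega * math.log(tau) - phi_prime)
    return tau ** (m - 1.0) * (b_prime + c_prime * oscillation)

# Hypothetical parameters with m < 1, so the power law grows as t -> tc.
params = dict(tc=100.0, m=0.5, omega=8.0, b_prime=1.0, c_prime=0.5, phi_prime=0.0)
hazard = [crash_hazard_rate(t, **params) for t in (0.0, 50.0, 90.0, 99.0, 99.9)]

# Pure power law (C' = 0) for comparison: h = (tc - t)^(m - 1).
power_law = dict(params, c_prime=0.0)
h_far = crash_hazard_rate(0.0, **power_law)
h_near = crash_hazard_rate(99.9, **power_law)
```

With $C' = 0$ the hazard rate rises monotonically towards $t_c$; the cosine term superimposes oscillations that become faster in calendar time as $t_c - t$ shrinks.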

We assume that all the investors of the market have already taken the diversification
risk into account, so that the no-arbitrage condition reads $E_t[\frac{dp}{p} - \gamma z(t)dt] = 0$, where the
expectation is performed with respect to the risk-neutral measure, and in the frame of the
risk-free rate. This is the condition that the price process concerning the diversification risk
should be a martingale. Taking the expectation of expression (3) under the filtration (or
history) until time $t$ reads

$$ E_t\left[ \frac{dp}{p} - \gamma z dt \right] = \mu(t)dt + \sigma(t)E_t[dW] - \kappa E_t[dj] . \tag{6} $$

Since $E_t[dW] = 0$ and $E_t[dj] = h(t)dt$ (equation (4)), together with the no-arbitrage
condition $E_t[\frac{dp}{p} - \gamma z(t)dt] = 0, \forall t$, this yields

$$ \mu(t) = \kappa h(t) . \tag{7} $$

This result (7) expresses that the return $\mu(t)$ is controlled by the risk of the crash quantified
by its crash hazard rate $h(t)$. The excess return $\mu(t) = \kappa h(t)$ is the remuneration that
investors require to remain invested in the bubbly asset, which is exposed to a crash risk.
Now, conditioned on the fact that no crash occurs, equation (3) is simply

$$ \frac{dp(t)}{p(t)} - \gamma z(t)dt = \mu(t)dt + \sigma(t)dW = \kappa h(t)dt + \sigma(t)dW , \tag{8} $$

where the Zipf factor $z(t)$ is given by expression (1). Its conditional expectation leads to

$$ E_t\left[ \frac{dp(t)}{p(t)} - \gamma z(t)dt \right] = \kappa h(t)dt . \tag{9} $$

Substituting with the expression (5) for $h(t)$ and (1) for $z(t)$, and integrating, yields the
log-periodic power law (LPPL) formula as in the JLS model, but here augmented by the
presence of the Zipf factor, which adds the term proportional to the Zipf factor loading $\gamma$:

$$ E_t[\ln p(t) - \gamma \zeta(t)] = A + B(t_c - t)^m + C(t_c - t)^m \cos(\omega \ln(t_c - t) - \phi) , \tag{10} $$

where $\zeta(t)$ is defined by expression (2) and the r.h.s. of (10) is the primitive of expression (5)
so that $B = -\kappa B'/m$ and $C = -\kappa C'/\sqrt{m^2 + \omega^2}$. This expression (10) describes the average
price dynamics only up to the end of the bubble. The same structure as equation (10) is
obtained using a stochastic discount factor following the derivation of Zhou and Sornette
[15], as shown in the appendix.
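The statement that the right-hand side of (10) is the primitive of $\kappa h(t)$ can be verified numerically. In the sketch below, the function names and parameter values are hypothetical. The phase shift `psi = atan2(omega, m)` relating $\phi$ to $\phi'$ is our own bookkeeping, obtained by differentiating (10); the text states only the amplitude relations for $B$ and $C$.

```python
import math

def lppl(t, tc, m, omega, phi, A, B, C):
    """Right-hand side of Eq. (10)."""
    tau = tc - t
    return A + B * tau ** m + C * tau ** m * math.cos(omega * math.log(tau) - phi)

def kappa_times_hazard(t, tc, m, omega, phi, B, C):
    """kappa * h(t) implied by B = -kappa B'/m and C = -kappa C'/sqrt(m^2 + omega^2).

    Derived assumption: the hazard-rate phase is phi' = phi - psi, psi = atan2(omega, m).
    """
    tau = tc - t
    amplitude = math.hypot(m, omega)
    psi = math.atan2(omega, m)
    oscillation = math.cos(omega * math.log(tau) - phi + psi)
    return tau ** (m - 1.0) * (-B * m - C * amplitude * oscillation)

# Hypothetical parameter values.
theta = dict(tc=10.0, m=0.5, omega=8.0, phi=1.0)
A, B, C = 7.0, -1.0, 0.05

# Central finite difference of the LPPL formula versus the implied kappa * h(t).
t, step = 5.0, 1e-6
slope = (lppl(t + step, A=A, B=B, C=C, **theta)
         - lppl(t - step, A=A, B=B, C=C, **theta)) / (2 * step)
implied = kappa_times_hazard(t, B=B, C=C, **theta)
```

The two numbers agree to within the finite-difference error, which confirms the sign convention: a growing hazard rate corresponds to $B < 0$.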

The JLS model does not specify what happens beyond $t_c$. This critical $t_c$ is the termination
of the bubble regime and the transition time to another regime. This regime could be
a big crash or a change of the growth rate of the market. Merrill Lynch EMU (European
Monetary Union) Corporates Non-Financial Index in 2009 [27] provides a vivid example
of a change of regime characterized by a change of growth rate rather than by a crash or
rebound. For $m < 1$, the crash hazard rate accelerates up to $t_c$ but its integral up to $t$, which
controls the total probability for a crash to occur up to $t$, remains finite and less than 1 for
all times $t < t_c$. It is this property that makes it rational for investors to remain invested
knowing that a bubble is developing and that a crash is looming. Indeed, there is still a
finite probability that no crash will occur during the lifetime of the bubble. The condition
that the price remains finite at all times, including $t_c$, imposes that $m > 0$.

Within the JLS framework, a bubble is qualified when the crash hazard rate accelerates.
According to (5), this imposes $m < 1$ and $B' > 0$, hence $B < 0$ since $m > 0$ by the condition
that the price remains finite. We thus have a first condition for a bubble to occur

$$ 0 < m < 1 . \tag{11} $$

By definition, the crash rate should be non-negative. This imposes [28]

$$ b := -Bm - |C|\sqrt{m^2 + \omega^2} \geq 0 . \tag{12} $$
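Conditions (11) and (12) translate directly into a filter on fitted parameters. This is a minimal sketch; the function name and the test values are hypothetical.

```python
import math

def satisfies_bubble_conditions(m, omega, B, C):
    """Check 0 < m < 1 (Eq. 11), B < 0, and b = -B m - |C| sqrt(m^2 + omega^2) >= 0 (Eq. 12)."""
    b = -B * m - abs(C) * math.hypot(m, omega)
    return 0.0 < m < 1.0 and B < 0.0 and b >= 0.0

# Hypothetical fits: the second has oscillations too large for a non-negative hazard rate.
ok = satisfies_bubble_conditions(m=0.5, omega=8.0, B=-1.0, C=0.05)
too_oscillatory = satisfies_bubble_conditions(m=0.5, omega=8.0, B=-1.0, C=0.1)
not_accelerating = satisfies_bubble_conditions(m=1.2, omega=8.0, B=-1.0, C=0.0)
```

For the first fit, $b = 0.5 - 0.05 \times \sqrt{0.25 + 64} \approx 0.10$, so the hazard rate never becomes negative; doubling $C$ makes $b$ negative and the fit is rejected.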

III. CALIBRATION METHOD 

There are eight parameters in this LPPL model augmented by the introduction of the
Zipf factor, four of which are linear parameters ($\gamma$, $A$, $B$ and $C$). The other four
($t_c$, $m$, $\omega$ and $\phi$) are nonlinear parameters.

We first slave the linear parameters to the nonlinear ones. The method here is the same
as used by Johansen et al. [1–3]. The detailed equations and procedure are as follows. We
rewrite Eq. (10) as:

$$ E[\ln p(t)] = \gamma \zeta(t) + A + B f(t) + C g(t) := RHS(t) . \tag{13} $$

We have also defined

$$ f(t) = (t_c - t)^m , \qquad g(t) = (t_c - t)^m \cos(\omega \ln(t_c - t) - \phi) . \tag{14} $$

The minimization of the sum of the squared residuals should satisfy

$$ \frac{\partial \sum_t [\ln p(t) - RHS(t)]^2}{\partial \theta} = 0 , \quad \forall \theta \in \{\gamma, A, B, C\} . \tag{15} $$

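The linear parameters $\gamma$, $A$, $B$ and $C$ are determined as the solutions of the linear system of
four equations:

$$
\sum_{t=t_1}^{t_2}
\begin{pmatrix}
\zeta^2(t) & \zeta(t) & \zeta(t)f(t) & \zeta(t)g(t) \\
\zeta(t) & 1 & f(t) & g(t) \\
\zeta(t)f(t) & f(t) & f^2(t) & f(t)g(t) \\
\zeta(t)g(t) & g(t) & f(t)g(t) & g^2(t)
\end{pmatrix}
\begin{pmatrix} \gamma \\ A \\ B \\ C \end{pmatrix}
=
\sum_{t=t_1}^{t_2}
\begin{pmatrix} \zeta(t)\ln p(t) \\ \ln p(t) \\ f(t)\ln p(t) \\ g(t)\ln p(t) \end{pmatrix} . \tag{16}
$$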

This provides four analytical expressions for the four linear parameters ($\gamma$, $A$, $B$, $C$) as a
function of the remaining nonlinear parameters $t_c$, $m$, $\omega$, $\phi$. The resulting cost function (sum
of squared residuals) becomes a function of just the four nonlinear parameters $t_c$, $m$, $\omega$, $\phi$. This
achieves a very substantial gain in stability and efficiency as the search space is reduced to
the 4-dimensional parameter space ($t_c$, $m$, $\omega$, $\phi$). A heuristic search implementing the taboo
algorithm [29] is used to find initial estimates of the parameters, which are then passed to a
Levenberg-Marquardt algorithm [30, 31] to minimize the residuals (the sum of the squares
of the differences) between the model and the data. The calibration is performed for the
time window delineated by $[t_1, t_2]$, where $t_1$ is the starting time and $t_2$ is the ending time of
the price time series being fitted by expression (10) or equivalently (13).
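The slaving step can be sketched as follows: for fixed nonlinear parameters, build the normal equations (16) and solve them for ($\gamma$, $A$, $B$, $C$). This is an illustration under stated assumptions, not the authors' implementation. All function names are hypothetical, the data are synthetic and noise-free, and a small Gaussian elimination stands in for whatever linear solver was actually used. The taboo search and Levenberg-Marquardt stages are not reproduced here.

```python
import math

def basis(t, tc, m, omega, phi):
    """f(t) and g(t) of Eq. (14)."""
    tau = tc - t
    f = tau ** m
    return f, f * math.cos(omega * math.log(tau) - phi)

def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            k = a[r][col] / a[col][col]
            for j in range(col, n + 1):
                a[r][j] -= k * a[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][j] * x[j] for j in range(r + 1, n))) / a[r][r]
    return x

def slave_linear(ts, lnp, zeta, tc, m, omega, phi):
    """Return (gamma, A, B, C) solving Eq. (16) and the sum of squared residuals."""
    rows = []
    for t, z in zip(ts, zeta):
        f, g = basis(t, tc, m, omega, phi)
        rows.append((z, 1.0, f, g))  # regressors multiplying gamma, A, B, C
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    rhs = [sum(r[i] * y for r, y in zip(rows, lnp)) for i in range(4)]
    gamma, A, B, C = solve_linear(normal, rhs)
    sse = sum((y - (gamma * r[0] + A + B * r[2] + C * r[3])) ** 2
              for r, y in zip(rows, lnp))
    return (gamma, A, B, C), sse

# Synthetic data generated from known (hypothetical) parameters.
nonlinear = dict(tc=230.0, m=0.5, omega=8.0, phi=1.0)
true_linear = (0.4, 8.0, -0.3, 0.02)
ts = [float(t) for t in range(200)]
zeta = [0.1 * math.sin(0.05 * t) for t in ts]
lnp = []
for t, z in zip(ts, zeta):
    f, g = basis(t, **nonlinear)
    lnp.append(true_linear[0] * z + true_linear[1] + true_linear[2] * f + true_linear[3] * g)

estimate, sse = slave_linear(ts, lnp, zeta, **nonlinear)
```

In a full calibration, `slave_linear` would be called inside the cost function of the nonlinear search, so that the optimizer only explores ($t_c$, $m$, $\omega$, $\phi$).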
The bounds of the search space are:

$$ t_c \in [t_2, t_2 + 0.375(t_2 - t_1)] \tag{17} $$
$$ m \in [10^{-5}, 1 - 10^{-5}] \tag{18} $$
$$ \omega \in [0.01, 40] \tag{19} $$
$$ \phi \in [0, 2\pi - 10^{-5}] \tag{20} $$

We choose these bounds because $m$ has to be between 0 and 1 according to the discussion
above; the log-angular frequency $\omega$ should be greater than 0. The upper bound 40 is large
enough to catch high-frequency oscillations (though we later discard fits with $\omega > 20$); the
phase should be between 0 and $2\pi$; the predicted critical time $t_c$ should be after the end
$t_2$ of the fitted time series. Finally, the upper bound of the critical time $t_c$ should not be
too far away from the end of the time series since predictive capacity degrades far beyond
$t_2$. Jiang et al. [7] have found empirically that a reasonable choice is to take the maximum
horizon of predictability to extend to about one-third of the size of the fitted time window.
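The search bounds and the later rejection of fits with $\omega > 20$ can be encoded as a small helper. This is a sketch with hypothetical names; the margin `eps` that keeps $m$ and $\phi$ strictly inside their intervals is set to 1e-5 here as an assumption.

```python
import math

def search_bounds(t1, t2, eps=1e-5):
    """Bounds of the nonlinear search space for a fit window [t1, t2].

    Assumption: eps is the small margin keeping m and phi off the interval ends.
    """
    return {
        "tc": (t2, t2 + 0.375 * (t2 - t1)),
        "m": (eps, 1.0 - eps),
        "omega": (0.01, 40.0),
        "phi": (0.0, 2.0 * math.pi - eps),
    }

def keep_fit(fit, bounds, omega_max=20.0):
    """Keep a fit that lies inside the bounds and does not oscillate too fast."""
    inside = all(bounds[k][0] <= fit[k] <= bounds[k][1] for k in bounds)
    return inside and fit["omega"] <= omega_max

# Hypothetical window of 300 trading days.
bounds = search_bounds(t1=0.0, t2=300.0)
accepted = keep_fit({"tc": 320.0, "m": 0.5, "omega": 8.0, "phi": 1.0}, bounds)
rejected = keep_fit({"tc": 320.0, "m": 0.5, "omega": 25.0, "phi": 1.0}, bounds)
```

For a 300-day window, the critical time is searched in [300, 412.5], that is, up to 112.5 days after the end of the data.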

IV. APPLICATION TO THE SHANGHAI COMPOSITE INDEX (SSEC) 

A. Construction of the capitalization-weighted and equally-weighted portfolios

We use the Shanghai Composite Index as the market proxy to test the JLS model augmented
with the Zipf factor. The Shanghai Composite Index is a capital-weighted measure
of stock market performance. On December 19, 1990, the base value of the Shanghai Composite
Index was fixed to 100. We denote the base date by $t_B$. Denoting by $K_B$ the total
market capitalization of the firms entering in the Shanghai Composite Index on $t_B$ (December
19, 1990), the value $p(t)$ of the Shanghai Composite Index at any later time $t$ is given by

$$ p(t) = \frac{K(t)}{K_B} \times 100 , \tag{21} $$

where $K(t)$ is the current total market capitalization of the constituents of the Shanghai
Composite Index. Here, time is counted in units of trading days. Calling $p_j(t)$ (respectively
$s_j(t)$) the share price (respectively total number of shares) of firm $j$ at time $t$, we have the
total capitalization of firm $j$ at time $t$

$$ K_j(t) = p_j(t) s_j(t) , \tag{22} $$

and the total market capitalization at time $t$

$$ K(t) = \sum_{j=1}^{M(t)} K_j(t) , \tag{23} $$

where $M(t)$ is the number of the stocks listed in the index at time $t$.
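Equations (21) to (23) amount to a few lines of arithmetic. The sketch below uses hypothetical function names and a made-up two-firm market.

```python
def total_market_cap(prices, shares):
    """Eqs. (22)-(23): K(t) = sum over firms of share price times number of shares."""
    return sum(p * s for p, s in zip(prices, shares))

def index_value(prices, shares, base_cap, base_value=100.0):
    """Eq. (21): p(t) = K(t) / K_B * 100."""
    return total_market_cap(prices, shares) / base_cap * base_value

# Hypothetical two-firm market: base capitalization 1000, current capitalization 1500.
level = index_value(prices=[10.0, 20.0], shares=[50.0, 50.0], base_cap=1000.0)
```

A market whose total capitalization has grown by half since the base date therefore stands at an index level of 150.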

At the time when the calibrations were performed, the SSEC market included 884 active
stocks. Since December 19, 1990, 36 firms were delisted and another 11 were temporarily
stopped. Based on the rule of the index calculation, the terminated stocks are deleted
from the total market capitalization after the termination is executed, while the last active
capitalization of the temporarily stopped stocks is still included in the total market
capitalization.

The equal-weighted price $p_e$ entering in the definition of the Zipf factor is constructed
according to the formula:

$$ p_e(t) = p(t_0) \times \exp\left( \sum_{i=t_1}^{t} r_e(i) \right) , \tag{24} $$

where $t_1$ is the beginning of the fitted window and $t_0$ is the trading day immediately preceding
$t_1$. We use this measure of $p_e$ to make sure that the equal-weighted price and the
value-weighted price are identical at $t_0$. This implies that $\zeta(t_0)$ is set to be 0 (recall that $\zeta$ is
defined by expression (2)). The return $r_e(i)$ is defined by



$$ r_e(i) = \frac{1}{M(i)} \sum_{j=1}^{M(i)} \left[ \ln K_j(i) - \ln K_j(i-1) \right] . \tag{25} $$

In expression (25), $K_j(i)$ is the total capitalization value of firm $j$ at time $i$ and $M(i)$ is
the number of the stocks which are listed in the index for both time $i$ and $i - 1$. Formula
(25) together with (24) means that the Zipf factor is a portfolio that puts an equal amount
of wealth at each time step (by a corresponding dynamical reallocation depending on the
relative performance of the $M(i)$ stocks as a function of time) on each of the $M(i)$ stocks
entering in the definition of the Shanghai Composite Index, so that the Zipf portfolio is
maximally diversified (neglecting here the impact of cross-correlations between the assets).
Putting expression (25) inside (24) yields

$$ p_e(t) = p(t_0) \times \prod_{i=t_1}^{t} \left[ \prod_{j=1}^{M(i)} \frac{K_j(i)}{K_j(i-1)} \right]^{1/M(i)} . \tag{26} $$

When the number of the stocks remains unchanged from $t_0$ to $t$, i.e.

$$ M(i) = M , \quad \forall i \in [t_0, t] , \tag{27} $$

expression (26) can be simplified as:

$$ p_e(t) = p(t_0) \times \left[ \prod_{j=1}^{M} \frac{K_j(t)}{K_j(t_0)} \right]^{1/M} , \tag{28} $$

showing that $p_e(t)$ is proportional to the geometrical mean of the capitalizations of the stocks
constituting the Shanghai Composite Index, as compared with the index which is proportional to the
arithmetic mean of the firm capitalizations.
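The construction (24)-(25) and the geometric-mean identity (28) can be checked on a toy market with constant membership. The function names and capitalizations below are hypothetical; the sketch assumes the same firms are listed at every step, so that $M(i) = M$.

```python
import math

def equal_weighted_return(caps_now, caps_prev):
    """Eq. (25): average log-change of firm capitalizations between two days."""
    changes = [math.log(now / prev) for now, prev in zip(caps_now, caps_prev)]
    return sum(changes) / len(changes)

def equal_weighted_price(p0, caps_by_day):
    """Eq. (24): pe(t) = p(t0) * exp(cumulative sum of r_e). caps_by_day[0] is day t0."""
    prices, cumulative = [p0], 0.0
    for prev, now in zip(caps_by_day, caps_by_day[1:]):
        cumulative += equal_weighted_return(now, prev)
        prices.append(p0 * math.exp(cumulative))
    return prices

# Hypothetical market: a small firm doubles in value while a large firm is flat.
caps_by_day = [[100.0, 400.0], [200.0, 400.0]]
p0 = 1000.0

pe = equal_weighted_price(p0, caps_by_day)
p_value_weighted = p0 * sum(caps_by_day[-1]) / sum(caps_by_day[0])  # from Eq. (21)
zeta_end = math.log(p_value_weighted) - math.log(pe[-1])            # Eq. (2)
```

Here the equal-weighted price rises by a factor $\sqrt{2 \times 1} \approx 1.41$, while the value-weighted index rises by only 1.2, so the integrated Zipf factor is negative. Gains concentrated in small firms produce $\zeta < 0$, which is the pattern discussed below for bubble 2.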

B. Empirical test of the JLS model augmented by the Zipf factor 

The Shanghai Composite Index had two famous bubbles in recent history as described
in Table I. Both of them are tested in this paper. The time series are fitted with both the
original JLS model and the new model. The 10 best initial guesses from the heuristic search
algorithm are kept. The results are shown in Figs. 1-2.

Example     Calibration start t_1    Prediction start t_2    Peak date of the bubble
Bubble 1    Aug-01-2006              Sep-28-2007             Oct-16-2007
Bubble 2    Oct-31-2008              Jul-01-2009             Aug-04-2009

TABLE I: Information on the tested bubbles of SSEC.

We use the standard Wilks test of nested hypotheses to check the improvement of the
new factor model. This test assumes independent and normally distributed residuals. The
null hypothesis is:

$H_0$: the original JLS model is sufficient and the new factor model is not necessary.

The alternative hypothesis reads:

$H_1$: the original JLS model is not sufficient and the new factor model is needed.

For sufficiently large time windows, and noting $T$ the number of trading days in the fitted
time window $[t_1, t_2]$, the Wilks log-likelihood ratio reads

$$ W = 2 \log \frac{L_{Zipf,max}}{L_{JLS,max}} = 2T \ln \frac{\sigma_{JLS}}{\sigma_{Zipf}} + \frac{\sum_{t=1}^{T} R_{JLS}^2(t)}{\sigma_{JLS}^2} - \frac{\sum_{t=1}^{T} R_{Zipf}^2(t)}{\sigma_{Zipf}^2} , \tag{29} $$

where $R_{JLS}$ and $\sigma_{JLS}$ (respectively $R_{Zipf}$ and $\sigma_{Zipf}$) are the residuals and their corresponding
standard deviation for the original JLS model (respectively the new factor model).

In the large $T$ limit, and under the above conditions of asymptotic independence and
normality, the $W$-statistic is distributed according to a $\chi^2_k$ distribution with $k$ degrees of freedom,
where $k$ is the difference between the number of parameters in the two models. In our case,
the new factor model has one more parameter, which is $\gamma$. Therefore, $W$ in Eq. (29) should
follow the $\chi^2_1$ distribution.
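The Wilks statistic (29) and its p-value under a chi-square distribution with one degree of freedom can be computed with the standard library alone. The sketch below makes two labeled assumptions: each $\sigma$ is estimated as the root mean square of the corresponding residuals (the maximum-likelihood estimate, in which case the last two terms of (29) cancel), and the survival function is written as $P(\chi^2_1 > W) = \mathrm{erfc}(\sqrt{W/2})$. Function names and residual values are hypothetical.

```python
import math

def wilks_statistic(res_jls, res_zipf):
    """Eq. (29), with sigma estimated as the root mean square residual (assumption)."""
    T = len(res_jls)
    if len(res_zipf) != T:
        raise ValueError("both models must be fitted on the same window")
    ss_jls = sum(r * r for r in res_jls)
    ss_zipf = sum(r * r for r in res_zipf)
    sigma_jls = math.sqrt(ss_jls / T)
    sigma_zipf = math.sqrt(ss_zipf / T)
    return (2.0 * T * math.log(sigma_jls / sigma_zipf)
            + ss_jls / sigma_jls ** 2 - ss_zipf / sigma_zipf ** 2)

def chi2_one_dof_pvalue(w):
    """P(chi-square with 1 degree of freedom > w), using chi2_1 = Z^2 for standard normal Z."""
    if w <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(w / 2.0))

# Hypothetical residuals: the augmented model halves the residual amplitude.
w = wilks_statistic([2.0, -2.0, 2.0, -2.0], [1.0, -1.0, 1.0, -1.0])
p_value = chi2_one_dof_pvalue(w)
```

A small p-value leads to rejecting $H_0$ in favor of the model augmented by the Zipf factor.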

Only considering the best fit for each of the two models, we obtain a p-value associated
with the empirical value of the $W$-statistic equal to 2.64 x 10^^ for bubble 1 and 0.2517 for
bubble 2. Thus, the null hypothesis is rejected and the Zipf factor is necessary for the best
fit of bubble 1, while the null hypothesis is not rejected and the Zipf factor is not necessary
for the best fit of bubble 2. This result is also consistent with the two values found for $\gamma$,
where $\gamma = 0.44$ for bubble 1 and $\gamma = -0.028$ for bubble 2, showing that the Zipf factor in bubble
1 plays an important role in the improvement of the fit quality.

Keeping the best 10 fits as we described before increases the statistical power of the Wilks
test (simply by having more statistical data) and we want to show that the new JLS model
with the Zipf factor is a significant improvement. For this, we combine all of the residuals
from the best 10 fits to the data into a large residual sample and calculate the Wilks
log-likelihood ratio $W$ for this large sample as defined by expression (29). The corresponding
p-values are 0 for bubble 1 and 0.0119 for bubble 2. This means the new factor model
performs better than the original JLS model for both cases when we consider the overall
quality of the best 10 fits.

A natural and interesting test is to find out if the new model with the Zipf factor has a better
predictability of the critical time. To achieve this goal, the two examples are fitted by both
models within different time windows obtained by varying their start time $t_1$ and their end
time $t_2$. We consider 15 different values of $t_1$ and of $t_2$ in steps of 3 days, yielding 225 time
series for each example. We keep the best 10 fits for each time series and get 2250 predicted
critical times $t_c$ with each model and for each example. The results in Table II show that the
mean value and the standard deviation of the critical time $t_c$ for both models are similar.
The new model including the Zipf factor neither improves nor deteriorates the predictability
of the critical time for these two examples.
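The counting behind the 225 windows and 2250 predictions can be made explicit. The sketch below uses a hypothetical function name. The direction in which $t_1$ and $t_2$ are shifted is an assumption, since the text states only the number of values and the 3-day step.

```python
def window_grid(t1_first, t2_first, n_values=15, step=3):
    """All (t1, t2) pairs from n_values start times and n_values end times.

    Assumption: both endpoints are shifted forward from their first value.
    """
    starts = [t1_first + i * step for i in range(n_values)]
    ends = [t2_first + j * step for j in range(n_values)]
    return [(start, end) for start in starts for end in ends]

# Hypothetical day indices for the first start and end of the window.
windows = window_grid(t1_first=0, t2_first=300)
fits_kept_per_window = 10
n_predictions = len(windows) * fits_kept_per_window
```

Each of the 225 windows contributes its 10 best fits, which gives the 2250 predicted critical times per model and per bubble summarized in Table II.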

Example     Peak date      Mean (std) of t_c, new model    Mean (std) of t_c, original model
Bubble 1    Oct-16-2007    Oct-07-2007 (55.6)              Oct-18-2007 (54.1)
Bubble 2    Aug-04-2009    Jul-04-2009 (33.6)              Jul-05-2009 (32.4)

TABLE II: Prediction of the critical time for both models. For each example, 225 time series are
generated by varying the start time $t_1$ and end time $t_2$ of the windows in which the calibration
is performed. The mean value and the standard deviation of the predicted critical time $t_c$ among
2250 predictions are shown in the table.

However, the new model makes it possible to determine the concentration of stock gains
over time from the knowledge of the Zipf factor. The two bubbles are found to differ by the
sign and contribution of the Zipf factor as well as the factor load $\gamma$.

For bubble 1, the integrated Zipf factor $\zeta$ is positive as shown in Fig. 1, corresponding
to the fact that valuation gains were more concentrated on the large firms of the Shanghai
index, especially in two periods, Dec. 2006 - Jan. 2007 and Oct. 2007 - Dec. 2007. The
factor load $\gamma$ of the best fit in the example shown in Fig. 1 is 0.44. The statistics
of $\gamma$ from all the 2250 fits of bubble 1 are shown in the second row of Tab. III. All these
results indicate that the Zipf factor load $\gamma$ in bubble 1 is statistically large and positive.
This implies the existence of a lack-of-diversification premium that contributes significantly
to the overall price level in addition to the bubble component.

A possible interpretation of the importance of the Zipf factor is based on the importance
that investors started to attribute to the role of large companies in driving the appreciation
of the SSEC index during the first bubble. The so-called 80-20 rule became a popular topic
among investors in discussions and interpretation of the rising SSEC index. It was widely
pointed out that the growth of the SSEC index was driven essentially by 20% of the stocks
while the other 80% of the constituents of the index remained approximately flat (known as the
80-20 quotation of the Chinese stock market [32]). It is plausible that the widespread
acknowledgement of the 80-20 rule led many investors to discount the risk of a lack of
diversification, therefore enhancing the role of the Zipf factor. This is consistent with our
observation that the Zipf factor load $\gamma$ is large and positive during the first bubble period.

In contrast, the integrated Zipf factor $\zeta$ remained negative over the lifetime of bubble 2
as shown in Fig. 2, implying that the gains of the Shanghai index were more driven by small
and medium size firms. The factor load $\gamma$ is -0.028 for the best fit shown in Fig. 2 and the
mean value of $\gamma$ for bubble 2 is small and negative (see Tab. III). The overall contribution
of the Zipf factor to the stock change is therefore small and positive (being the product of a
negative integrated Zipf factor and a negative factor loading), which makes the remuneration
of investors due to their exposure to the diversification risk still positive but small.

Example     Mean of γ    Median of γ    Std of γ
Bubble 1    0.35         0.56           0.43
Bubble 2    -0.14        -0.11          0.15

TABLE III: Statistics of the Zipf factor load $\gamma$ from 2250 fit results. Most of the values for $\gamma$
for the period during the development of bubble 1 are positive and their average value is large.
This means that the Zipf factor plays an important role during the development of bubble 1. The
concentration of the stock market on a small number of large firms has a significant impact on the
price change of the stock index. In contrast, for bubble 2, the average value of $\gamma$ is relatively small
and the exposure to the risk associated with a lack of diversification is found to be insignificant
in pricing the value of the market.

At the time when bubble 2 started, the world economy had been seriously shaken by the
developing subprime crisis. The demand for Chinese product exports decreased dramatically.
To compensate for the loss from collapsing exports, the Chinese government launched a 4
trillion Chinese yuan stimulus with the aim of boosting domestic demand. Small companies,
which are usually more vulnerable to a lack of access to capital, profited proportionally more
than their larger counterparts from this injection of capital in the economy. This is reflected
in the relatively better performance of small and medium size firms in the stock market, leading to
a slightly negative value of the integrated Zipf factor $\zeta$ during the development of bubble 2.
Although the small companies benefited more, the stimulus was designed to boost the whole
economy. The diversification risk turned out to be relatively minor at that time, explaining
the small value of the Zipf factor load.

V. CONCLUSION 



We have introduced a new model that combines the Zipf factor embodying the risk due 
to lack of diversification with the Johansen-Ledoit-Sornette model of rational expectation 
bubbles with positive feedbacks. The new model keeps all the dynamical characteristics of a 
bubble described in the JLS model. In addition, the new model can also provide information 
about the concentration of stock gains over time from the knowledge of the Zipf factor. This 
new information is very helpful to understand the risk diversification and to explain the 
investors' behavior during the bubble generation. We have applied this new model to two 
famous Chinese stock bubbles and found that the new model provides a sensible explanation
for the diversification risk observed during these two bubbles.

Appendix A: Derivation of the model with stochastic pricing kernel theory



We present another derivation of the model using the theory of the stochastic pricing
kernel. Our derivation follows and adapts that presented by Zhou and Sornette [15].

Under this theory, the no-arbitrage condition is presented as follows. The product of
the stochastic pricing kernel (stochastic discount factor) $D(t)$ and the value process $p(t)$ of
any admissible self-financing trading strategy implemented by trading on a financial asset
should be a martingale:

$$ D(t)p(t) = E_t[D(t')p(t')] , \quad \forall t' > t . \tag{A1} $$

Let us assume that the dynamics of the stochastic pricing kernel is formulated as:

$$ \frac{dD(t)}{D(t)} = -r(t)dt - \gamma z(t)dt - \lambda(t)dW + \nu d\hat{W} , \tag{A2} $$

where $r(t)$ is the interest rate and $z(t)$ is the Zipf factor defined as (1). The process $\lambda(t)$
denotes the market price of risk, as measured by the covariance of asset returns with the
stochastic discount factor, and $d\hat{W}$ represents all other stochastic factors acting on the
stochastic pricing kernel. By definition, $d\hat{W}$ is independent of $dW$ at any time $t > 0$:

$$ E_t[dW \cdot d\hat{W}] = E_t[dW] \cdot E_t[d\hat{W}] = 0 , \quad \forall t > 0 . \tag{A3} $$



We further use the standard form of the price dynamics in the JLS model [1–3]:

$$ \frac{dp}{p} = \mu dt + \sigma(t)dW - \kappa dj , \tag{A4} $$

where $W$ is the same Brownian motion as in (A2). The term $dj$ represents the jump process,
valued 0 when there is no crash and 1 when the crash occurs. The dynamics of the jumps
is governed by the crash hazard rate $h(t)$ defined in (5) with:

$$ E_t[dj] = h(t)dt . \tag{A5} $$

According to the stochastic pricing kernel theory, $D \times p$ should be a martingale. Taking
the future time $t'$ in (A1) as the increment of the current time $t$, then

$$
\begin{aligned}
0 &= E_t\left[ \frac{p(t + dt)D(t + dt) - p(t)D(t)}{p(t)D(t)} \right] \\
  &= E_t\left[ \frac{(p(t) + dp)(D(t) + dD) - p(t)D(t)}{p(t)D(t)} \right] \\
  &= E_t\left[ \frac{p(t)dD + D(t)dp + dD\,dp}{p(t)D(t)} \right] \\
  &= E_t\left[ \frac{dD}{D} + \frac{dp}{p} + \frac{dD\,dp}{Dp} \right] .
\end{aligned}
\tag{A6}
$$

To satisfy this equation, the coefficient of $dt$ should be zero, that is
$-r(t) + \mu(t) - \gamma z(t) - \kappa h(t) - \sigma(t)\lambda(t) = 0$. This yields

$$ \mu(t) = r(t) + \gamma z(t) + \kappa h(t) + \sigma(t)\lambda(t) . \tag{A7} $$
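The coefficient of $dt$ can be assembled term by term from (A2), (A4) and (A5). The following worked step is a sketch to leading order in $dt$, assuming $E_t[dW] = E_t[d\hat{W}] = 0$, $dW^2 = dt$, $E_t[dW \cdot d\hat{W}] = 0$ as in (A3), and neglecting the products of $dj$ with the Brownian increments:

```latex
\begin{aligned}
E_t\!\left[\frac{dD}{D}\right] &= -\big(r(t) + \gamma z(t)\big)\,dt , \\
E_t\!\left[\frac{dp}{p}\right] &= \big(\mu(t) - \kappa h(t)\big)\,dt , \\
E_t\!\left[\frac{dD\,dp}{D\,p}\right] &= -\lambda(t)\,\sigma(t)\,dt , \\
\text{sum} = 0 \;\Longrightarrow\;
\mu(t) &= r(t) + \gamma z(t) + \kappa h(t) + \sigma(t)\lambda(t) .
\end{aligned}
```

The resulting drift is the integrand that appears in the expectation of $\ln p(t)$ below.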

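When there is no crash ($dj = 0$), the expectation of the price process is obtained by
integrating (A4):

$$ E_t[\ln p(t)] = \int \big( \gamma z(t) + \kappa h(t) + r(t) + \sigma(t)\lambda(t) \big) dt . \tag{A8} $$

For $r(t) = 0$ and $\lambda(t) = 0$, we obtain:

$$
\begin{aligned}
E_t[\ln p(t)] &= \int \big( \gamma z(t) + \kappa h(t) \big) dt \\
              &= \gamma \zeta(t) + \int \kappa h(t) dt \\
              &= \gamma \zeta(t) + A + B(t_c - t)^m + C(t_c - t)^m \cos(\omega \ln(t_c - t) - \phi) ,
\end{aligned}
\tag{A9}
$$

which recovers (10).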

FIG. 1: Calibration of the new factor model and the original JLS model to the Shanghai Composite
Index (SSEC) between Aug-01-2006 and Sep-28-2007. (Upper panel) The beginning of the fit
interval is the left boundary of the plot, while the end of the fit interval is indicated by the vertical
thick black dotted line. The real critical time $t_c$ when the crash started is marked by the vertical
magenta dot-dashed line. The historical close prices are shown as blue full circles. The best 10 fits
of the original JLS model are shown as the green dashed lines and the best 10 fits of the new factor
model are shown as the red solid lines. (Lower panel) The corresponding Zipf factor (magenta
solid line with 'x' symbol) and $\zeta$ function (blue dot-dashed line) during this period.






FIG. 2: Calibration of the new factor model and the original JLS model to the Shanghai Composite
Index (SSEC) between Oct-31-2008 and Jul-01-2009. (Upper panel) The beginning of the fit
interval is the left boundary of the plot, while the end of the fit interval is indicated by the vertical
thick black dotted line. The real critical time $t_c$ when the crash started is marked by the vertical
magenta dot-dashed line. The historical close prices are shown as blue full circles. The best 10 fits
of the original JLS model are shown as the green dashed lines and the best 10 fits of the new factor
model are shown as the red solid lines. (Lower panel) The corresponding Zipf factor (magenta
solid line with 'x' symbol) and $\zeta$ function (blue dot-dashed line) during this period.



